AITopics | local step

local step

LOSCAR-SGD: Local SGD with Communication-Computation Overlap and Delay-Corrected Sparse Model Averaging

Maziane, Yassine, Mahran, Ammar, Maranjyan, Artavazd, Richtárik, Peter

arXiv.org Machine Learning, May-21-2026

Communication is a major bottleneck in distributed learning, especially in large-scale settings and in federated learning environments with slow links. Three standard ways to reduce this cost are communication compression, local training, and communication-computation overlap. Methods that combine these ingredients are used in practice and have been found to be effective for large-scale training, but there is little theory for methods that combine all three. We study a heterogeneous-compute setting in which different workers may take different numbers of local steps, and we propose LOSCAR-SGD, a Local SGD method that communicates only a sparse subset of model coordinates and continues optimizing while communication is in flight. A key ingredient is a delay-corrected merge rule that incorporates delayed synchronized information without discarding the progress made during the overlap phase. We give convergence guarantees for smooth non-convex objectives and show how sparsity, overlap, and worker heterogeneity affect the rate. To the best of our knowledge, this is the first theory for this combination of ingredients. Experiments further show that communication-computation overlap reduces training time and that the delay-corrected merge outperforms naive overwriting.
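The abstract names three ingredients: local steps, sparse communication, and a merge that keeps overlap-phase progress. A toy sketch of how they might fit together follows. It is not the LOSCAR-SGD algorithm: the abstract does not state the merge rule, so `delay_corrected_merge` uses an assumed rule (delayed average plus the local change made while the average was in flight). The quadratic loss and all function names are likewise hypothetical.

```python
import random

def sgd_steps(x, target, steps, lr=0.1):
    """Run `steps` gradient steps on the toy loss 0.5 * ||x - target||^2."""
    x = list(x)
    for _ in range(steps):
        x = [xi - lr * (xi - ti) for xi, ti in zip(x, target)]
    return x

def delay_corrected_merge(current, snapshot, averaged, coords):
    """Assumed merge rule (not taken from the paper): on each communicated
    coordinate, use the delayed average plus the local progress made since
    the snapshot was sent. Naive overwriting would drop that progress."""
    merged = list(current)
    for j in coords:
        merged[j] = averaged[j] + (current[j] - snapshot[j])
    return merged

def run_round(models, targets, local_steps, k, rng):
    """One round: send k coordinates, keep training, then merge."""
    dim = len(models[0])
    coords = rng.sample(range(dim), k)       # sparse subset to communicate
    snapshots = [list(x) for x in models]    # values sent at round start
    averaged = {j: sum(s[j] for s in snapshots) / len(snapshots)
                for j in coords}
    # Overlap phase: workers keep optimizing while the average is in flight,
    # each with its own number of local steps (heterogeneous compute).
    models = [sgd_steps(x, t, h)
              for x, t, h in zip(models, targets, local_steps)]
    # Merge the delayed average without discarding overlap-phase progress.
    return [delay_corrected_merge(x, s, averaged, coords)
            for x, s in zip(models, snapshots)]

# Example: three workers with different targets and step counts.
demo_rng = random.Random(0)
demo_models = [[0.0] * 4 for _ in range(3)]
demo_targets = [[1.0] * 4, [2.0] * 4, [3.0] * 4]
for _ in range(50):
    demo_models = run_round(demo_models, demo_targets, [1, 2, 4], 2, demo_rng)
```

Replacing the merge line with `merged[j] = averaged[j]` gives the naive overwrite that the abstract compares against.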

artificial intelligence, communication, machine learning, (18 more...)

2605.20866

Country: North America (0.28)

Genre: Research Report > Experimental Study (0.46)

Industry: Education (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

362c99307cdc3f2d8b410652386a9dd1-Supplemental.pdf

Neural Information Processing Systems, Apr-25-2026, 11:19:41 GMT

artificial intelligence, machine learning, probability, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

Zeroth-Order Methods for Nondifferentiable, Nonconvex, and Hierarchical Federated Optimization

Neural Information Processing Systems, Apr-24-2026, 14:57:45 GMT

Federated learning (FL) has emerged as an enabling framework for communication-efficient decentralized training. We study three broadly applicable problem classes in FL: (i) Nondifferentiable nonconvex federated optimization; (ii) Federated bilevel optimization; (iii) Federated minimax problems. Notably, in an implicit sense, both (ii) and (iii) are instances of (i). However, the hierarchical problems in (ii) and (iii) are often complicated by the absence of a closed-form expression for the implicit objective function. Unfortunately, research on these problems has been limited and afflicted by reliance on strong assumptions, including the need for differentiability and L-smoothness of the implicit function. We address this shortcoming by making the following contributions. In (i), by leveraging convolution-based smoothing and Clarke's subdifferential calculus, we devise a randomized smoothing-enabled zeroth-order FL method and derive communication and iteration complexity guarantees for computing an approximate Clarke stationary point. To contend with (ii) and (iii), we devise a unified randomized implicit zeroth-order FL framework, equipped with explicit communication and iteration complexities. Importantly, our method utilizes delays during local steps to skip making calls to the inexact lower-level FL oracle.
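A minimal sketch of the kind of estimator that "randomized smoothing-enabled zeroth-order" methods typically build on may help here. It is the standard two-point estimator for the gradient of a ball-smoothed function, which uses only function values, and it is not claimed to be the paper's exact method. The names `zo_gradient` and `unit_sphere_sample` and the parameter `eta` (smoothing radius) are hypothetical.

```python
import math
import random

def unit_sphere_sample(dim, rng):
    """Draw a direction uniformly from the unit sphere (normalized Gaussian)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def zo_gradient(f, x, eta, rng, samples=1):
    """Two-point zeroth-order estimate of the gradient of the smoothed f.

    Averages dim * (f(x + eta*v) - f(x - eta*v)) / (2*eta) * v over random
    unit directions v; f itself may be nondifferentiable."""
    dim = len(x)
    grad = [0.0] * dim
    for _ in range(samples):
        v = unit_sphere_sample(dim, rng)
        plus = f([xi + eta * vi for xi, vi in zip(x, v)])
        minus = f([xi - eta * vi for xi, vi in zip(x, v)])
        scale = dim * (plus - minus) / (2.0 * eta * samples)
        grad = [g + scale * vi for g, vi in zip(grad, v)]
    return grad

# Example: a nonsmooth objective, queried only through function values.
def l1_norm(x):
    return sum(abs(c) for c in x)

estimate = zo_gradient(l1_norm, [1.0, -2.0], eta=0.1,
                       rng=random.Random(0), samples=1000)
```

In a federated setting, each client would presumably apply such an estimator to its local objective during local steps. How the paper schedules those calls and handles the lower-level oracle is not specified in the abstract.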

artificial intelligence, machine learning, optimization, (16 more...)

Country: North America > United States (0.46)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

f9379afacdbabfdc6b060972b60f9ab8-Paper-Conference.pdf

Neural Information Processing Systems, Feb-12-2026, 22:47:34 GMT

algorithm, arxiv preprint arxiv, international conference, (14 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
North America > Canada > Ontario > Toronto (0.04)
(4 more...)

Genre: Research Report (0.46)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Communications > Networks (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Breaking the centralized barrier for cross-device federated learning

Neural Information Processing Systems, Feb-11-2026, 20:41:29 GMT

Federated learning (FL) is a challenging setting for optimization due to the heterogeneity of the data across different clients which can cause a client drift phenomenon.

artificial intelligence, arxiv preprint arxiv, machine learning, (13 more...)

Country: North America > United States > Virginia (0.05)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

362c99307cdc3f2d8b410652386a9dd1-Supplemental.pdf

Neural Information Processing Systems, Feb-8-2026, 05:35:06 GMT

nullq, probability, quantization, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

362c99307cdc3f2d8b410652386a9dd1-Paper.pdf

Neural Information Processing Systems, Feb-8-2026, 05:35:03 GMT

communication, node, quantization, (14 more...)

Country:

Europe > Austria (0.04)
North America > United States > Massachusetts (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Zeroth-Order Methods for Nondifferentiable, Nonconvex, and Hierarchical Federated Optimization

Neural Information Processing Systems, Feb-7-2026, 16:14:16 GMT

Federated learning (FL) has emerged as an enabling framework for communication-efficient decentralized training.

artificial intelligence, machine learning, optimization, (16 more...)

Country:

North America > United States > Pennsylvania (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

SA-PEF: Step-Ahead Partial Error Feedback for Efficient Federated Learning

Redie, Dawit Kiros, Arablouei, Reza, Werner, Stefan

arXiv.org Machine Learning, Jan-29-2026

Biased gradient compression with error feedback (EF) reduces communication in federated learning (FL), but under non-IID data, the residual error can decay slowly, causing gradient mismatch and stalled progress in the early rounds. We propose step-ahead partial error feedback (SA-PEF), which integrates step-ahead (SA) correction with partial error feedback (PEF). SA-PEF recovers EF when the step-ahead coefficient α = 0 and step-ahead EF (SAEF) when α = 1. For non-convex objectives and δ-contractive compressors, we establish a second-moment bound and a residual recursion that guarantee convergence to stationarity under heterogeneous data and partial client participation. To balance SAEF's rapid warm-up with EF's long-term stability, we select α near its theory-predicted optimum. Experiments across diverse architectures and datasets show that SA-PEF consistently reaches target accuracy faster than EF.

Modern large-scale machine learning increasingly relies on distributed computation, where both data and compute are spread across many devices. Federated learning (FL) enables model training in this setting without centralizing raw data, enhancing privacy and scalability under heterogeneous client distributions (McMahan et al., 2017; Kairouz et al., 2021). In each synchronous FL round, the server broadcasts the current global model to a subset of clients. These clients perform several steps of stochastic gradient descent (SGD) on their local data and return updates to the server, which aggregates them to form the next global iterate (Huang et al., 2022; Wang & Ji, 2022; Li et al., 2024). Although FL leverages rich distributed data, it faces two key challenges.
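For orientation, here is a minimal sketch of the classic error-feedback (EF) baseline that the abstract builds on, using top-k sparsification as the biased compressor. It does not implement SA-PEF: the step-ahead correction and the coefficient α are not specified in enough detail here to reproduce. The function names `top_k` and `ef_step` are hypothetical.

```python
def top_k(vec, k):
    """Keep the k largest-magnitude entries and zero the rest.

    Top-k is a standard contractive compressor:
    ||top_k(x) - x||^2 <= (1 - k/d) * ||x||^2 for x of dimension d."""
    keep = sorted(range(len(vec)), key=lambda j: abs(vec[j]), reverse=True)[:k]
    out = [0.0] * len(vec)
    for j in keep:
        out[j] = vec[j]
    return out

def ef_step(update, residual, k):
    """One client-side step of classic error feedback.

    Adds the carried-over residual to the fresh update, compresses the sum,
    and keeps whatever the compressor dropped as the new residual."""
    corrected = [u + r for u, r in zip(update, residual)]
    message = top_k(corrected, k)
    new_residual = [c - m for c, m in zip(corrected, message)]
    return message, new_residual

# Example: two consecutive rounds on one client, sending one coordinate each.
msg1, res1 = ef_step([3.0, -1.0, 0.5], [0.0, 0.0, 0.0], k=1)
msg2, res2 = ef_step([0.0, -1.0, 0.0], res1, k=1)
```

In the second round the dropped `-1.0` from round one is added back before compression, so the second coordinate is eventually transmitted. The abstract's concern is that under non-IID data this residual can decay slowly.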

artificial intelligence, compression, machine learning, (17 more...)

2601.20738

Country: